Rationalize and try to fix failing ldiv tests #2809
Open
kshyatt wants to merge 2 commits into master from ksh/interfaces_fix
+10
−12
Your PR requires formatting changes to meet the project's style guidelines. Click here to view the suggested changes.
diff --git a/test/libraries/cusparse/interfaces.jl b/test/libraries/cusparse/interfaces.jl
index fa25d8330..34f9d75f8 100644
--- a/test/libraries/cusparse/interfaces.jl
+++ b/test/libraries/cusparse/interfaces.jl
@@ -258,7 +258,7 @@ nB = 2
end
end
@testset "\\ -- CuMatrix" begin
- C = triangle(opa(A)) \ opb(B)
+ C = triangle(opa(A)) \ opb(B)
dC = triangle(opa(dA)) \ opb(dB)
@test C ≈ collect(dC)
if CUSPARSE.version() < v"12.0"
CUDA.jl Benchmarks
Benchmark suite | Current: ef0395c | Previous: 4f38802 | Ratio |
---|---|---|---|
latency/precompile | 42834310121 ns | 42824416725 ns | 1.00 |
latency/ttfp | 7055511558 ns | 7051266950 ns | 1.00 |
latency/import | 3581529444 ns | 3574411987 ns | 1.00 |
integration/volumerhs | 9609883 ns | 9608389 ns | 1.00 |
integration/byval/slices=1 | 146843 ns | 146872 ns | 1.00 |
integration/byval/slices=3 | 425595 ns | 425794 ns | 1.00 |
integration/byval/reference | 145040 ns | 144942 ns | 1.00 |
integration/byval/slices=2 | 286222 ns | 286144 ns | 1.00 |
integration/cudadevrt | 103486 ns | 103388 ns | 1.00 |
kernel/indexing | 14275 ns | 14276 ns | 1.00 |
kernel/indexing_checked | 14903 ns | 15083 ns | 0.99 |
kernel/occupancy | 665.1635220125786 ns | 677.6114649681529 ns | 0.98 |
kernel/launch | 2084.9 ns | 2157.8888888888887 ns | 0.97 |
kernel/rand | 15789 ns | 14900 ns | 1.06 |
array/reverse/1d | 19840 ns | 20028 ns | 0.99 |
array/reverse/2d | 24759 ns | 25007 ns | 0.99 |
array/reverse/1d_inplace | 10572 ns | 10952 ns | 0.97 |
array/reverse/2d_inplace | 12164 ns | 12545 ns | 0.97 |
array/copy | 21058 ns | 21084 ns | 1.00 |
array/iteration/findall/int | 156440 ns | 158043.5 ns | 0.99 |
array/iteration/findall/bool | 139111 ns | 140007 ns | 0.99 |
array/iteration/findfirst/int | 161781 ns | 164557.5 ns | 0.98 |
array/iteration/findfirst/bool | 163534.5 ns | 167385 ns | 0.98 |
array/iteration/scalar | 71885 ns | 74295 ns | 0.97 |
array/iteration/logical | 211491.5 ns | 215875.5 ns | 0.98 |
array/iteration/findmin/1d | 46205 ns | 47331 ns | 0.98 |
array/iteration/findmin/2d | 96280 ns | 97017 ns | 0.99 |
array/reductions/reduce/Int64/1d | 41975.5 ns | 43072.5 ns | 0.97 |
array/reductions/reduce/Int64/dims=1 | 45421 ns | 55698.5 ns | 0.82 |
array/reductions/reduce/Int64/dims=2 | 61650 ns | 62572.5 ns | 0.99 |
array/reductions/reduce/Int64/dims=1L | 88841 ns | 89129 ns | 1.00 |
array/reductions/reduce/Int64/dims=2L | 87080 ns | 88184.5 ns | 0.99 |
array/reductions/reduce/Float32/1d | 34328 ns | 35313 ns | 0.97 |
array/reductions/reduce/Float32/dims=1 | 44565.5 ns | 51818 ns | 0.86 |
array/reductions/reduce/Float32/dims=2 | 59597 ns | 59835 ns | 1.00 |
array/reductions/reduce/Float32/dims=1L | 52401 ns | 52336 ns | 1.00 |
array/reductions/reduce/Float32/dims=2L | 70127 ns | 70233.5 ns | 1.00 |
array/reductions/mapreduce/Int64/1d | 42310 ns | 44093 ns | 0.96 |
array/reductions/mapreduce/Int64/dims=1 | 46824 ns | 47633.5 ns | 0.98 |
array/reductions/mapreduce/Int64/dims=2 | 61836 ns | 62709 ns | 0.99 |
array/reductions/mapreduce/Int64/dims=1L | 89032 ns | 89036 ns | 1.00 |
array/reductions/mapreduce/Int64/dims=2L | 87169 ns | 87347.5 ns | 1.00 |
array/reductions/mapreduce/Float32/1d | 34370 ns | 34780.5 ns | 0.99 |
array/reductions/mapreduce/Float32/dims=1 | 41835 ns | 41996.5 ns | 1.00 |
array/reductions/mapreduce/Float32/dims=2 | 60443 ns | 60450.5 ns | 1.00 |
array/reductions/mapreduce/Float32/dims=1L | 52707 ns | 52739 ns | 1.00 |
array/reductions/mapreduce/Float32/dims=2L | 70578 ns | 70715 ns | 1.00 |
array/broadcast | 20213 ns | 20360 ns | 0.99 |
array/copyto!/gpu_to_gpu | 12757 ns | 12890 ns | 0.99 |
array/copyto!/cpu_to_gpu | 215578 ns | 217680 ns | 0.99 |
array/copyto!/gpu_to_cpu | 284650.5 ns | 286671 ns | 0.99 |
array/accumulate/Int64/1d | 124951 ns | 125190 ns | 1.00 |
array/accumulate/Int64/dims=1 | 83369 ns | 84136 ns | 0.99 |
array/accumulate/Int64/dims=2 | 157877 ns | 158690 ns | 0.99 |
array/accumulate/Int64/dims=1L | 1710070 ns | 1709534 ns | 1.00 |
array/accumulate/Int64/dims=2L | 966339 ns | 967437 ns | 1.00 |
array/accumulate/Float32/1d | 109036.5 ns | 109803 ns | 0.99 |
array/accumulate/Float32/dims=1 | 80422.5 ns | 81170 ns | 0.99 |
array/accumulate/Float32/dims=2 | 147586 ns | 147834 ns | 1.00 |
array/accumulate/Float32/dims=1L | 1618609 ns | 1619112.5 ns | 1.00 |
array/accumulate/Float32/dims=2L | 698220 ns | 698583 ns | 1.00 |
array/construct | 1284.9 ns | 1275.8 ns | 1.01 |
array/random/randn/Float32 | 43542.5 ns | 44761 ns | 0.97 |
array/random/randn!/Float32 | 24899 ns | 25104 ns | 0.99 |
array/random/rand!/Int64 | 27523 ns | 27468 ns | 1.00 |
array/random/rand!/Float32 | 8803.666666666666 ns | 8662 ns | 1.02 |
array/random/rand/Int64 | 38172 ns | 30080 ns | 1.27 |
array/random/rand/Float32 | 13130 ns | 13152 ns | 1.00 |
array/permutedims/4d | 60326.5 ns | 60473 ns | 1.00 |
array/permutedims/2d | 53930 ns | 54524 ns | 0.99 |
array/permutedims/3d | 54833 ns | 55468 ns | 0.99 |
array/sorting/1d | 2757985 ns | 2763710 ns | 1.00 |
array/sorting/by | 3344404 ns | 3356377 ns | 1.00 |
array/sorting/2d | 1080451 ns | 1085339 ns | 1.00 |
cuda/synchronization/stream/auto | 1026.7857142857142 ns | 1018.0909090909091 ns | 1.01 |
cuda/synchronization/stream/nonblocking | 8057.6 ns | 7602.700000000001 ns | 1.06 |
cuda/synchronization/stream/blocking | 795.1456310679612 ns | 806.236559139785 ns | 0.99 |
cuda/synchronization/context/auto | 1172.1 ns | 1183.8 ns | 0.99 |
cuda/synchronization/context/nonblocking | 8369.599999999999 ns | 7801 ns | 1.07 |
cuda/synchronization/context/blocking | 914.7560975609756 ns | 897.2923076923076 ns | 1.02 |
This comment was automatically generated by workflow using github-action-benchmark.
maleadt approved these changes Jul 3, 2025
5db6744 to ef0395c
Trying to fix intermittently failing CI. It doesn't make sense to have these checks for only one of the in-place/out-of-place versions. Hopefully this helps stability.
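
For readers unfamiliar with the two code paths, here is a minimal CPU-only sketch of checking the in-place and out-of-place triangular solves with the same assertions. It is an illustration under assumptions, not the code in this PR: it uses only the `LinearAlgebra` and `SparseArrays` standard libraries instead of CUSPARSE, and the matrix sizes and the testset name are hypothetical.

```julia
using LinearAlgebra, SparseArrays, Test

# Hypothetical sizes and data; the real test file defines its own A, B, opa, opb.
n, nB = 5, 2
A = sparse(rand(n, n) + n * I)   # diagonally dominant, so the triangular solves are well conditioned
B = rand(n, nB)

@testset "ldiv: same checks for both versions" begin
    for triangle in (UpperTriangular, LowerTriangular)
        T = triangle(A)

        C = T \ B          # out-of-place solve, allocates the result
        D = copy(B)
        ldiv!(T, D)        # in-place solve, overwrites D with the solution

        # Apply identical checks to both results.
        @test T * C ≈ B
        @test T * D ≈ B
        @test C ≈ D
    end
end
```

Running the same assertions against both results means a discrepancy in either path shows up in the same place, rather than only one path being guarded.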